Documenting Information Systems:
This chapter introduces two techniques for documenting business processes: Data Flow Diagrams (DFDs) and System Flowcharting (SFC). It discusses how to read these documents, their purpose, and guidelines for creating DFDs and SFCs from narratives (see discussion).
Data Flow Diagrams (DFD):
The purpose and value of the data flow diagram is primarily data discovery, not process mapping. Data flow diagrams can be used to provide a clear representation of any business function. The technique starts with an overall picture of the business and continues by analyzing each of the functional areas of interest. This analysis can be carried out to precisely the level of detail required. The technique exploits a method called top-down expansion to conduct the analysis in a targeted way.
A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system. A data flow diagram can also be used for the visualization of data processing (structured design). It is common practice for a designer to draw a context-level DFD first, which shows the interaction between the system and outside entities. This context-level DFD is then "exploded" to show more detail of the system being modeled. All data flows must begin and/or end with a process because data flows either initiate or result from a process.
Benefits of DFD:
- Uncover misunderstandings of a system's processes
- Help communicate the analyst's understanding of the system to management and end users
- Help take the system out of context (eases the problem of thinking outside the box)
There are several common modeling rules to follow when creating DFDs:
- All processes must have at least one data flow in and one data flow out.
- All processes should modify the incoming data, producing new forms of outgoing data.
- Each data store must be involved with at least one data flow.
- Each external entity must be involved with at least one data flow.
- A data flow must be attached to at least one process.
Identifying the existing business processes, using a technique like data flow diagrams, is an essential precursor to business process re-engineering, migration to new technology, or refinement of an existing business process. However, the level of detail required will depend on the type of change being considered.
Rules Governing DFD Construction
- A process cannot have only outputs (a "miracle").
- A process cannot have only inputs (a "black hole").
- The inputs to a process must be sufficient to produce the outputs from the process (otherwise it is a "gray hole").
- All data stores must be connected to at least one process.
- A data store cannot be connected to a source or a sink.
- A data flow can have only one direction of flow. Multiple data flows to and/or from the same process and data store must be shown by separate arrows.
- If the exact same data flows to two separate processes, it should be represented by a forked arrow.
- Data cannot flow directly back into the process it has just left.
- All data flows must be named using a noun phrase.
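Several of the rules above are purely structural, so they can be checked mechanically. The following is a minimal sketch, assuming a DFD is stored as a dictionary of named nodes and a list of labeled arrows. The representation, the function name `check_dfd`, and the flow labels are hypothetical and only for illustration. It flags miracles, black holes, flows that touch no process (which also covers a data store wired straight to a source or sink), and flows that loop back into the process they left. Gray holes need human judgment and are not checked.

```python
from collections import defaultdict

# Hypothetical node kinds for this sketch; they mirror the DFD symbols.
PROCESS, STORE, ENTITY = "process", "store", "entity"


def check_dfd(nodes, flows):
    """Return a list of rule violations found in a DFD.

    nodes: dict mapping node name -> kind (PROCESS, STORE, or ENTITY)
    flows: list of (source, target, label) tuples, one per arrow
    """
    problems = []
    inputs, outputs = defaultdict(int), defaultdict(int)
    for source, target, label in flows:
        outputs[source] += 1
        inputs[target] += 1
        # Every data flow must begin or end at a process.
        if PROCESS not in {nodes[source], nodes[target]}:
            problems.append(f"flow '{label}' is not attached to a process")
        # Data cannot flow directly back into the process it just left.
        if source == target:
            problems.append(f"flow '{label}' returns directly to {source}")
    for name, kind in nodes.items():
        n_in, n_out = inputs[name], outputs[name]
        if n_in + n_out == 0:
            problems.append(f"{name} has no data flows")
        elif kind == PROCESS and n_in == 0:
            problems.append(f"{name} is a miracle (outputs only)")
        elif kind == PROCESS and n_out == 0:
            problems.append(f"{name} is a black hole (inputs only)")
    return problems


# Node names follow the enrollment example; the flow labels are made up.
nodes = {
    "Applicant": ENTITY,
    "Inspect Forms": PROCESS,
    "Input Student Information": PROCESS,
    "Student DB": STORE,
}
flows = [
    ("Applicant", "Inspect Forms", "enrollment forms"),
    ("Inspect Forms", "Input Student Information", "inspected forms"),
    ("Applicant", "Student DB", "student details"),  # deliberately wrong
]
for problem in check_dfd(nodes, flows):
    print(problem)
```

In this deliberately flawed example the checker reports the entity-to-store arrow and notes that Input Student Information has no outgoing flow.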
The introduction of the data flow diagram in the late 1970s gave analysts a way to map the flow of data throughout an organization. The simple act of creating a DFD helps reveal which processes are taking place within the organization.
In the late 1970s data-flow diagrams (DFDs) were introduced and popularized for structured analysis and design (Gane and Sarson 1979). DFDs show the flow of data from external entities into the system, how the data moves from one process to another, and its logical storage. Figure 1 presents an example of a DFD using the Gane and Sarson notation. There are only four symbols:
- Squares representing external entities, which are sources or destinations of data.
- Rounded rectangles representing processes, which take data as input, do something to it, and output it.
- Arrows representing the data flows, which can be either electronic data or physical items.
- Open-ended rectangles representing data stores, including electronic stores such as databases or XML files and physical stores such as filing cabinets or stacks of paper.
To create the diagram I simply worked through a usage scenario, in this case the use case logic described in the Enroll in University system use case. On actual projects it’s far more common just to stand at a whiteboard with one or more project stakeholders and simply sketch as we talk through a problem.
In this case I started with the applicant, the external entity in the top left corner, and simply followed the flow of data throughout the system. I introduced the Inspect Forms process to encapsulate the initial validation steps. I assigned this process identifier 1.0, indicating that it’s the first process on the top-level diagram. A common technique with DFDs is to create detailed diagrams for each process to depict more granular levels of processing. Were I to do this for this process I would number the subprocesses 1.1, 1.2, and so on. Subprocesses of 1.1 would be numbered 1.1.1, 1.1.2, and so on. I wouldn’t bother to expand this process into a more detailed DFD because it is fairly clear what is happening in it, so the new diagram wouldn’t add any value. I also indicated who or what does the work in the bottom section of the process bubble, in this case the registrar. This information is optional, although very useful in my experience. You can see how improperly filled out forms are returned to the applicant if required.
I then continued to follow the logic of the use case, concentrating on how the data is processed by each step. The second process encapsulates the logic for creating a student record, including checking whether the person is eligible to enroll and whether they’re already in the database. Notice how each data flow on the diagram has been labeled. Also notice that the names of the data change to reflect how it’s been processed.
Now that I look closer at the diagram, the arrow between the Input Student Information process and the Student DB data store should be two-way because this process searches the database for existing student records. Unfortunately I’ve erased this diagram from my whiteboard, so it isn’t easy to address this minor problem. Yes, I could use a drawing program to update the arrowhead, but it’s more important to make the point that agile models don’t need to be perfect; they just need to be good enough. Agile Modeling (AM) recommends that you follow the practice Update Models Only When it Hurts, and in this case the issue doesn’t hurt enough to invest the two or three minutes it would take to fix the diagram.
The Collect Fees process is interesting because it interacts with an electronic data store, Financial DB, as well as a physical one, Cash Drawer. DFDs can be used to model processes that are purely physical, purely electronic, or, more commonly, a mix of both. Electronic data stores can be modeled via data models, particularly if they represent a relational database. Physical data stores are typically self-explanatory.
Although many traditional methods tend to apply DFDs in dysfunctional ways, it is still possible to apply them in an agile manner. Keep your diagrams small, as I did above. Use simple tools, such as whiteboards, to create them with your stakeholders. Travel light and erase them when you’re through with them. Create them if they’re going to add value, not simply because your process tells you to do so. The bottom line is that some of the modeling methodologies may have been flawed, but the need to represent the data flow within a system remains.
Types of Dataflow Diagrams (define and explain the differences):
- Physical: a graphical representation of a system showing the system's internal and external entities and the flows of data into and out of these entities. An internal entity is an entity (a place such as a department, a person such as an accounting clerk, or a thing such as a computer) within the system that transforms data. An external entity is an entity (place, person, or thing) outside the system that sends data to, or receives data from, the system. Physical DFDs specify where, how, and by whom a system's processes are accomplished.
- Logical: a graphical representation of a system showing the system's processes (bubbles), data stores, and the flows of data into and out of the processes and data stores. Logical DFDs are used to document information systems because they represent the logical nature of a system.
Differences between physical and logical DFDs are the following:
- A physical DFD does not tell us what is being accomplished, whereas a logical DFD shows what tasks the system is doing without specifying how, where, or by whom the tasks are accomplished.
- The advantage of a logical DFD is that we can concentrate on the functions that the system performs. A logical DFD portrays a system's activities, whereas a physical DFD depicts a system's infrastructure. Both pictures are needed to understand the system completely.
- In a physical DFD the processes are labeled with nouns, whereas in a logical DFD the processes are labeled with verbs that describe the actions being performed.
Levels of Dataflow Diagrams:
- Context: the top-level, least detailed diagram of an information system. It depicts the system and all of its activities as a single bubble and shows the data flows between the system and the external entities.
- Level-0: an exploded view of the single bubble shown in the context diagram. Level-0 illustrates the major subdivisions of the process represented by the context diagram; its processes are labeled 1.0, 2.0, 3.0, and so on.
- Level-1: illustrates the major subdivisions of each process in the Level-0 diagram. This is referred to as top-down partitioning. Processes at this level are labeled 1.1, 1.2, 2.1, 2.2, 2.3, 3.1, 3.2, and so on.
- Level-x (how low can you go?): top-down partitioning can continue until the processes are broken down to a logical stopping point. Do not break a system down so far that the inputs and the outputs of a process are the same.
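The numbering convention used in top-down partitioning can be expressed in a few lines. This is only a sketch of the labeling scheme described above. The helper names `child_ids` and `level_of` are hypothetical, and identifiers are assumed to be strings such as "1.0" or "2.3".

```python
def child_ids(parent_id, count):
    """Return identifiers for `count` subprocesses of the process `parent_id`."""
    # Level-0 processes are written "1.0", "2.0", ...; drop the trailing ".0"
    # so that the children of "1.0" become "1.1", "1.2", and so on.
    prefix = parent_id[:-2] if parent_id.endswith(".0") else parent_id
    return [f"{prefix}.{n}" for n in range(1, count + 1)]


def level_of(process_id):
    """Return the DFD level on which a process identifier appears."""
    if process_id.endswith(".0"):
        return 0
    return process_id.count(".")


print(child_ids("1.0", 3))  # ['1.1', '1.2', '1.3']
print(child_ids("1.1", 2))  # ['1.1.1', '1.1.2']
print(level_of("2.3"))      # 1
```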
Balanced Data Flow Diagram set: a DFD set is balanced when the data flows to and from the external entities are equivalent at every level, so the number of inputs from and outputs to the external entities is the same in each diagram.
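One way to test whether two diagrams in a set are balanced is to compare only the flows that touch an external entity and ignore which internal process sits at the other end. The sketch below assumes flows are stored as `(source, target, label)` tuples. The function names and the flow labels are hypothetical. The process names follow the enrollment example.

```python
def external_flows(flows, external_entities):
    """Return the set of flows that cross the system boundary.

    The internal end of each flow is replaced by the placeholder "system",
    so a context diagram and a lower-level diagram can be compared directly.
    """
    result = set()
    for source, target, label in flows:
        if source in external_entities:
            result.add((source, "system", label))
        elif target in external_entities:
            result.add(("system", target, label))
    return result


def is_balanced(parent_flows, child_flows, external_entities):
    """True if both diagrams show the same flows to and from external entities."""
    return (external_flows(parent_flows, external_entities)
            == external_flows(child_flows, external_entities))


entities = {"Applicant"}
context = [
    ("Applicant", "Enroll in University", "enrollment forms"),
    ("Enroll in University", "Applicant", "receipt"),
]
level_0 = [
    ("Applicant", "Inspect Forms", "enrollment forms"),
    ("Inspect Forms", "Input Student Information", "inspected forms"),
    ("Collect Fees", "Applicant", "receipt"),
]
print(is_balanced(context, level_0, entities))       # True
print(is_balanced(context, level_0[:-1], entities))  # False: the receipt flow is missing
```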
Data Flow Diagram Symbols:
There are only four symbols used in drawing data flow diagrams:
- Process (Bubble): depicts an entity (in a physical DFD) or a process (in a logical DFD) within which incoming data flows are transformed into outgoing data flows.
- External Entity (Box): portrays a source or a destination of data outside the system.
- Data Flow (Arrow): shows the flow of information from its source to its destination. A data flow is represented by a line with an arrowhead showing the direction of flow. Information always flows to or from a process and may be written, verbal, or electronic. Each data flow may be referenced by the processes or data stores at its head and tail, or by a description of its contents.
- Data Store (Open-ended Rectangle): a holding place for information within the system. Data stores may be long-term files such as sales ledgers, or short-term accumulations such as batches of documents waiting to be processed. Each data store should be given a reference followed by an arbitrary number.
Preparing Systems Documentation:
Table of Entities and Activities (describe the steps for creation):
The table of entities and activities will lead to quicker and more accurate preparation of DFDs and a systems flowchart because it clarifies the information contained in a narrative and helps us to document the system correctly.
To begin the table, go through the narrative line by line and circle each activity being performed. An activity is any action being performed by an internal or external entity. Activities can include actions related to data (originate, transform, file, or receive) or to an operations process. Operations process activities might include picking goods in the warehouse, inspecting goods at the receiving dock, or counting cash. For each activity there must be an entity that performs the activity. As you circle each activity, put a box around the entity that performs the activity.
Now that you are ready to prepare your table, list each activity in the order that it is performed, regardless of the sequence in which it appears in the narrative. After you have listed all activities, consecutively number each activity.
The narrative may refer to some entities in more than one way, so watch for different names for the same entity. When in doubt about whether something is an activity, it is better to list a doubtful activity than to miss one (pg. 119).
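The ordering and numbering step can be illustrated with a small sketch. It assumes each circled activity has been recorded with the entity that performs it and its position in the actual sequence of work. The `Activity` class, the `build_table` function, the entity names, and the ordering are all hypothetical. The three activities are the operations process examples mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    entity: str       # the boxed entity that performs the activity
    description: str  # the circled activity
    order: int        # position in the real sequence of work, not in the narrative


def build_table(activities):
    """Return (number, entity, description) rows in the order performed."""
    ordered = sorted(activities, key=lambda a: a.order)
    return [(n, a.entity, a.description) for n, a in enumerate(ordered, start=1)]


# Activities as they might be encountered while reading a narrative.
found = [
    Activity("Cashier", "Counts cash", 3),
    Activity("Receiving clerk", "Inspects goods at the receiving dock", 1),
    Activity("Warehouse clerk", "Picks goods in the warehouse", 2),
]
table = build_table(found)
for number, entity, description in table:
    print(f"{number}. {entity}: {description}")
```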
Drawing the Context Diagram
DFD Guideline 1: Include within the system context (bubble) any entity that performs one or more information processing activities. Information processing activities are activities that retrieve data from storage, transform data, or file data. Examples include document preparation, data entry, verification, classification, sorting, calculation, summarizing, and filing. The sending and receiving of data between entities is not an information processing activity because the data is not transformed. Any entity that does not perform information processing is an external entity.
DFD Guideline 2: Include only normal processing routines, not exception routines or error routines, on context diagrams, physical DFDs, and level 0 logical DFDs.
DFD Guideline 3: Include in the systems documentation all (and only) activities and entities described in the systems narrative, no more, no less.
DFD Guideline 4: When multiple entities operate identically, depict only one to represent all.
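Guideline 1 amounts to a simple classification over the table of entities and activities. The sketch below assumes each activity has already been tagged with a type. The tag names, the function `classify_entities`, and the sample entities are hypothetical.

```python
# Activity types that count as information processing under Guideline 1.
# Sending and receiving are deliberately excluded.
INFORMATION_PROCESSING = {"retrieve", "transform", "file"}


def classify_entities(table):
    """Split entities into (internal, external) sets.

    table: list of (entity, activity_type) pairs taken from the
    table of entities and activities.
    """
    everyone = {entity for entity, _ in table}
    internal = {entity for entity, kind in table if kind in INFORMATION_PROCESSING}
    return internal, everyone - internal


sample = [
    ("Customer", "send"),
    ("Sales clerk", "receive"),
    ("Sales clerk", "transform"),
    ("Cashier", "file"),
    ("Bank", "receive"),
]
internal, external = classify_entities(sample)
print(sorted(internal))  # ['Cashier', 'Sales clerk']
print(sorted(external))  # ['Bank', 'Customer']
```

The internal entities end up inside the context bubble, and the external ones are drawn as boxes around it.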
Preparing Systems Flowcharts
Systems flowcharting guideline 1:
Divide the flowchart into columns: one column for each internal entity and one for each external entity. Label each column.
Systems flowcharting guideline 2:
Flowchart columns should be laid out so that the flowchart activities flow from left to right, but you should locate columns so as to minimize crossed lines and connectors.
Systems flowcharting guideline 3:
Flowchart logic should flow from top to bottom and from left to right. For clarity, put arrows on all flow lines.
Systems flowcharting guideline 4:
Keep the flowchart on one page. If you can't, use multiple pages and connect the pages with off-page connectors.
Systems flowcharting guideline 5:
Within each column, there must be at least one manual process, keying operation, or data store between documents. That is, do not directly connect documents within the same column. This guideline suggests that you show all the processing that is taking place. For example, if two documents are being attached, include a manual process to show the matching and attaching activities.
Systems flowcharting guideline 6:
When crossing organizational lines (i.e., moving from one column to another), show a document at both ends of the flow line unless the connection is so short that the intent is unambiguous.
Systems flowcharting guideline 7:
Documents or reports printed in a computer facility should be shown in that facility's column first. You can then show the document or report going to the destination unit.
Systems flowcharting guideline 8:
Documents or reports printed by a centralized computer facility on equipment located in another organizational unit (e.g., a warehouse or a shipping department) should not be shown within the computer facility.
Systems flowcharting guideline 9:
Processing within an organizational unit on devices such as a PC or computerized cash register should be shown within the unit or as a separate column next to that unit, but not in the central computer facility column.
Systems flowcharting guideline 10:
Sequential processing steps (either computerized or manual) with no delay between them (and resulting from the same input) can be shown as one process or as a sequence of processes.
Systems flowcharting guideline 11:
The only way to get data into or out of a computer data storage unit is through a computer processing rectangle. For example, if you key data from a source document, you must show a manual keying symbol, a rectangle or square, and then a computer storage unit.
Systems flowcharting guideline 12:
A manual process is not needed to show the sending of a document. The sending should be apparent from the movement of the document itself.
Systems flowcharting guideline 13:
Do not use a manual process to file a document. Just show the document going into the file.
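Two of these guidelines, 5 and 11, describe connections that should never appear, so they lend themselves to an automated check. The sketch below assumes a flowchart is stored as a dictionary of symbols (each with a column and a kind) plus a list of flow lines. The kind names, the function `check_flowchart`, and the sample symbols are hypothetical.

```python
def check_flowchart(symbols, lines):
    """Return violations of flowcharting guidelines 5 and 11.

    symbols: dict mapping symbol name -> (column, kind), where kind is one of
             "document", "manual_process", "keying", "computer_process",
             or "computer_store" (names assumed for this sketch)
    lines:   list of (source, target) flow lines
    """
    problems = []
    for source, target in lines:
        (col_a, kind_a), (col_b, kind_b) = symbols[source], symbols[target]
        # Guideline 5: documents in the same column are never connected directly.
        if kind_a == kind_b == "document" and col_a == col_b:
            problems.append(f"guideline 5: {source} -> {target}")
        # Guideline 11: computer storage is reached only through a computer process.
        kinds = {kind_a, kind_b}
        if "computer_store" in kinds and kinds != {"computer_store", "computer_process"}:
            problems.append(f"guideline 11: {source} -> {target}")
    return problems


symbols = {
    "Sales order": ("Sales", "document"),
    "Picking ticket": ("Sales", "document"),
    "Key order": ("Sales", "keying"),
    "Record order": ("Computer", "computer_process"),
    "Order file": ("Computer", "computer_store"),
}
good = [("Sales order", "Key order"), ("Key order", "Record order"),
        ("Record order", "Order file")]
bad = [("Sales order", "Picking ticket"), ("Key order", "Order file")]

print(check_flowchart(symbols, good))  # []
for problem in check_flowchart(symbols, bad):
    print(problem)
```

The `good` lines follow the keying example in guideline 11: source document, keying operation, computer process, then storage.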
Basic Flowchart Symbols
These are the basic system flowchart icons. Many sources on the internet show proprietary symbols, but this set appears to be consistent with our textbook.
For free trials of diagramming software, go to any of the following websites:
http://www.pacestar.com/edge/trial.htm
http://www.smartdraw.com/tutorials/software/dfd/tutorial_01.htm
https://login.quest.com//secure/logon.aspx?siteid=1&sid=3cc0b992-ace0-48c9-9939-0f6f7fcf6dca